Priority service disciplines are widely used in computer and
communications systems. Many such systems can be modeled by queuing
networks, but presently developed theory does not allow solution of 
these models when priority service disciplines are present. For priority 
queuing networks that have a homogeneity property, we give some 
explicit results for mean delay and throughput. However, the as- 
sumption of homogeneity is too restrictive for many applications. We 
identify some examples of systems for which inhomogeneous two-node 
priority queuing networks are appropriate models and yield to exact 
analysis. The results allow some conclusions to be drawn about using 
priorities in a two-node closed network to establish grades of service. 
We also use the results to evaluate a commonly used approximation 
technique for priority queuing systems. 

I. INTRODUCTION AND SUMMARY 

Priority service disciplines are widely used in computer and com- 
munication systems. One common application of priorities is in the 
establishment of multiple grades of service whereby deferrable or 
background work is scheduled according to a lower priority. In other 
applications, a device may give prioritized service to a class of jobs 
known to be short so as to increase overall system throughput. For 
purposes of performance analysis, computer and communication sys- 
tems have often been modeled as queuing networks. However, the 
theory of queuing networks in its present form (see Refs. 1, 2) does not 
provide solutions for even simple networks with priority disciplines, 
except in an approximate manner (Refs. 3 to 5).

There are very few exact results for queuing networks (i.e., queuing 
models with more than one service station or node) with priority to be 
found in the literature. One known result concerns a general service 
time, single-server queue with preemptive or nonpreemptive priority 
and finite exponential source (Ref. 6). This model can be thought of as a
two-node closed queuing network where the second node is a pure delay or
infinite-server group. In a paper by Avi-Itzhak and Heyman the mean 
cycle times were obtained for a central server model with priorities, 
under the assumption that the mean service times and routing patterns 
are the same for each priority class (Ref. 7). In other computer and
communication applications, network priorities have only been represented
approximately, using heuristics for the central server model (Refs. 3, 4)
and a packet switching network (Ref. 5). While these approximation techniques may
be adequate in accuracy for the parameter ranges of some applications, 
much further work is needed in improving and validating these tech- 
niques and, ultimately, in developing exact analytical results wherever 
they are tractable. 

Our goals are to obtain insight into the solution form for some simple 
cases of queuing networks with priorities, to obtain an initial evaluation 
of the accuracy of existing approximation techniques, and to draw 
some conclusions on the performance of some simple network priority 
structures occurring in practice. In Section II, we describe a general 
class of priority queuing networks that are homogeneous in the sense 
that all customer classes are treated identically with respect to service 
time and routing. For homogeneous networks, we are able to give a 
mean delay and throughput analysis. However, it will be seen that the 
homogeneity assumption is sufficiently restrictive as to prevent appli- 
cation of these results in many situations. Subsequently, we focus on 
two specific examples of systems that can be modeled by simple 
queuing networks, but in which priority disciplines and inhomogeneity 
play crucial roles. These examples suggest several different two-node 
priority queuing network models that yield to exact analysis. 

The first example we consider is a computer system consisting of a 
central processing unit (cpu) and an input/output (i/o) device, which 
processes both time-critical transactions and nontime-critical
batch jobs. The system is designed to give priority to the transactions 
at both the cpu and i/o device (in contrast to Refs. 3 and 4 where it 
is assumed that only the cpu observes priorities). This suggests use of 
the two-node, closed queuing network model A of Fig. 1a, with one
node representing the cpu, and the other, the i/o device. The model 
has separate queues for each priority class at each node, and priority 
is observed preemptively at both nodes. 

The second example is a full-duplex data link which is used for 
transmission of messages under a window flow control protocol. There 
are two grades of messages, the premium grade and the standard 
grade. When both premium grade messages and acknowledgments 
receive preemptive priority, model A is applicable (refer to Fig. 5 
which is explained further in Section VI). However, since
acknowledgments are typically shorter than messages, another configuration is
suggested wherein acknowledgments of either grade are given preemptive
priority; this leads to model B shown in Fig. 1b.

Fig. 1. Schematics of models A and B.

In both models A and B there are a fixed number of customers in 
each class and service time distributions at a node are assumed to be 
exponential, but are not required to be the same at a node for each 
customer class (in contrast with homogeneous networks or the first- 
come first-served nodes described in Ref. 1). Because of the exponential 
assumption, the priorities can be understood to be either preemptive- 
resume or preemptive-restart (with resampling). For each model, ser- 
vice within a customer class at a node can be thought of as first-come 
first-served, but all equations and results remain valid for any other 
discipline within the priority class which does not take into account 
the actual service time requirement when selecting a customer for 
service. 

The general approach we use in the analysis of models A and B, and 
several similar models, is to set up the balance equations (steady-state 
Kolmogorov forward equations) for the Markov chain describing the 
number of customers of each priority class at each node. These partial 
difference equations generally do not satisfy the well-known local 
balance condition (Refs. 1, 2) but, nevertheless, can still be solved to obtain the
stationary distribution. This distribution allows throughput and mean 
delay to be computed for each customer class.

These results can be applied to obtain some general conclusions
about the two systems we have used as motivating examples. In the
computer system that processes both transactions and batch jobs, we
find that if the transactions are bottlenecked at one device (cpu or
i/o), the batch jobs need to be even more strongly bottlenecked at the
other device, if a significant batch throughput is to be attained.
Specifically, we show that if the transactions have a bottleneck of
strength $x$ at one node, the batch jobs need to have a bottleneck of
strength $x^N$ at the other node (where $N$ is the transaction
multiprogramming level), if the batch jobs are to be able to fulfill a role as
"filler" work. In the data link example, we find a similar result: if
standard grade message traffic is carried in purely a background mode, 
fairly extreme parameters are necessary before its introduction be- 
comes attractive. On the other hand, if some compromise of the 
premium traffic performance is permitted, then an appreciable amount 
of standard grade message traffic can be carried by using data link 
capacity that would otherwise be wasted. For each system, we identify 
hazards that can occur when the lower-priority work is allowed to 
interfere with higher-priority work. Refer to Sections V and VI for 
further details. 

In Section VII, we use the results to evaluate the effectiveness of a 
well-known approximation technique. We find that accuracy of the 
approximation technique varies from good to poor, depending on the 
parameters of the application. A criterion on the application param- 
eters is proposed under which the approximation technique would be 
expected to perform well. 

II. HOMOGENEOUS NETWORKS 

In this section, we consider a class of queuing networks that allow 
preemptive priorities but are otherwise homogeneous in the sense that 
at any one node all customers are treated identically with respect to 
service rate and routing. The results rely on an observation similar to 
that made by Avi-Itzhak and Heyman in their analysis of the central 
server model (Ref. 7).

We first consider a closed queuing network of the Gordon-Newell
type (Ref. 8). It consists of $N$ service centers or nodes numbered
$1, \ldots, N$. In a departure from the Gordon-Newell formulation, there
are $P$ priority classes numbered $1, \ldots, P$, with the $i$th (priority)
class containing $K_i$ customers, $i = 1, \ldots, P$, and

\[ \sum_{i=1}^{P} K_i = K. \]

At any node, a customer from a higher-numbered class takes preemptive
priority over a customer from a lower-numbered class. The service
time distribution at node $j$ is exponential with rate $\mu_j$ for all customers,
and the service discipline is first-come first-served within each priority
class. After service at a node is completed, routing to another node is
governed by a probability vector which is the same for each priority
class. Let the state of the network be expressed by the quantities $n_j^i$,
$j = 1, \ldots, N$, $i = 1, \ldots, P$, where $n_j^i$ denotes the number of class $i$
customers present at node $j$. Define the aggregate state variable

\[ m_j^i = \sum_{k=i}^{P} n_j^k, \]

the number of priority class $i$ or higher customers at node $j$. The key
observation is that the random variable $m_j^i$ is equivalent to that which
would result if the network were modified by first removing customers
of priority less than $i$ (i.e., by setting $K_k = 0$, $1 \le k < i$) and, thereafter,
ignoring all priority service distinctions. This is because (i) lower-priority
customers exert no influence on higher-priority customers, and
(ii) regardless of whether priorities are observed between customers of
classes $i, i+1, \ldots, P$, transitions in the total number ($m_j^i$) of customers
of priority $i$, or larger, at a node are not altered (by the assumed
uniformity of service rate and routing over priority classes). Thus, we
can find the stationary distribution of $m_j^i$ by the usual closed queuing
network techniques (Refs. 1, 2, 8, 9). The stationary distribution of the aggregate
variable $m_j^i$ is sufficient to determine the steady-state mean delay and
throughput of each priority class at each node. This follows from the
fact that for each $i$ and $j$

\[ E[n_j^i] = E[m_j^i] - E[m_j^{i+1}], \]
\[ \Pr[n_j^i > 0,\ m_j^{i+1} = 0] = \Pr[m_j^i > 0] - \Pr[m_j^{i+1} > 0], \]

where $m_j^{P+1}$ is understood to be identically zero. Hence, priority class
$i$ customers have a throughput at node $j$ of

\[ T_j^i = \mu_j\,[\Pr(m_j^i > 0) - \Pr(m_j^{i+1} > 0)] \]

and a mean delay (including service time) of

\[ D_j^i = [E(m_j^i) - E(m_j^{i+1})]/T_j^i \]

by Little's Law. Note that these quantities are obtained in the process
of carrying out the mean value analysis for a single-chain closed
network with $K$ customers (Ref. 9).
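
The recipe above can be sketched numerically. The following is a minimal illustration, not the paper's own procedure: it assumes single-server exponential nodes, uses Buzen's convolution algorithm (in place of mean value analysis) to obtain $\Pr[m_j^i > 0]$ and $E[m_j^i]$ for the reduced populations, and then applies the throughput and delay formulas. The function names, the `visit_ratios` argument (relative visit frequencies solving the routing equations), and the requirement that every class size be positive are assumptions made for this example.

```python
def normalizing_constants(loads, population):
    """Buzen's convolution: G(0..population) for single-server exponential nodes."""
    g = [1.0] + [0.0] * population
    for x in loads:
        for k in range(1, population + 1):
            g[k] += x * g[k - 1]
    return g


def aggregate_stats(loads, j, population):
    """Return (Pr[m_j > 0], E[m_j]) when `population` customers circulate."""
    if population == 0:
        return 0.0, 0.0
    g = normalizing_constants(loads, population)
    busy = loads[j] * g[population - 1] / g[population]
    mean = sum(loads[j] ** k * g[population - k]
               for k in range(1, population + 1)) / g[population]
    return busy, mean


def class_metrics(visit_ratios, rates, class_sizes, j):
    """Per-class (throughput, mean delay) at node j.

    class_sizes[i] holds the size of class i + 1; the last entry is the
    highest-priority class. All class sizes are assumed positive.
    """
    loads = [v / mu for v, mu in zip(visit_ratios, rates)]
    results = []
    for i in range(len(class_sizes)):
        # Population of class i or higher, and of strictly higher classes.
        busy_i, mean_i = aggregate_stats(loads, j, sum(class_sizes[i:]))
        busy_next, mean_next = aggregate_stats(loads, j, sum(class_sizes[i + 1:]))
        throughput = rates[j] * (busy_i - busy_next)
        delay = (mean_i - mean_next) / throughput  # Little's law
        results.append((throughput, delay))
    return results
```

For a two-node cycle, `class_metrics([1.0, 1.0], [nu, 1.0], [M, N], 0)` returns the low- and high-priority throughput and delay at the left node when all customers share the service rates `nu` (left) and 1 (right).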

Similar results are obtained for an open network of the Jackson
type (Ref. 10). All notation is the same as for the closed network, except that
we no longer specify the number of customers of each priority class
but, instead, we specify the rate $\lambda_j^i$ of exogenous Poisson arrivals of
priority class $i$ to node $j$. We must now assume that the traffic
equations admit a unique solution $e_j^i$ representing the mean arrival
rate of customers of priority class $i$ to node $j$ and that

\[ \sum_{i=1}^{P} e_j^i < \mu_j \]

for each $j$. We make the analogous observation that the quantity $m_j^i$
can be obtained by considering the network modified by turning off all
arrival streams of priority less than $i$ (i.e., setting $\lambda_j^k = 0$, $1 \le k < i$,
$1 \le j \le N$) and, thereafter, ignoring priority distinctions at service. We
then have

\[ E[n_j^i] = E[m_j^i] - E[m_j^{i+1}], \]
\[ T_j^i = e_j^i, \]

and

\[ D_j^i = [E(m_j^i) - E(m_j^{i+1})]/e_j^i. \]

Now, since node $j$ of the modified network behaves as an M/M/1 queue,

\[ E[m_j^i] = \frac{\sum_{k=i}^{P} \rho_j^k}{1 - \sum_{k=i}^{P} \rho_j^k} \]

and, therefore,

\[ D_j^i = \frac{\mu_j^{-1}}{\left(1 - \sum_{k=i}^{P} \rho_j^k\right)\left(1 - \sum_{k=i+1}^{P} \rho_j^k\right)}, \]

where $\rho_j^k = e_j^k/\mu_j$ is the utilization of node $j$ due to class $k$ customers.
We, thus, recognize the validity in a network context of the Cobham-type
formula originally obtained by White and Christie (Ref. 11) for the delay
in an isolated preemptive priority M/M/1 queue when the service
times are the same for each priority class.
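
As a small worked illustration of the delay formula, the sketch below evaluates $D_j^i$ at a single node from the class arrival rates $e_j^i$ and the common service rate $\mu_j$. The function name and argument layout are assumptions made for this example; the list is ordered from class 1 (lowest priority) to class $P$ (highest).

```python
def priority_delays(mu_j, arrival_rates):
    """Mean delay per class at one node of a homogeneous open network.

    arrival_rates[i] is the mean arrival rate of class i + 1; the last
    entry is the highest-priority class. mu_j is the common service rate.
    """
    rho = [e / mu_j for e in arrival_rates]
    if sum(rho) >= 1.0:
        raise ValueError("total utilization must be below 1")
    delays = []
    for i in range(len(rho)):
        load_i = sum(rho[i:])         # classes i and higher
        load_next = sum(rho[i + 1:])  # strictly higher classes
        delays.append((1.0 / mu_j) / ((1.0 - load_i) * (1.0 - load_next)))
    return delays
```

With a single class the result reduces to the M/M/1 delay $1/(\mu_j - e_j)$, and the highest-priority class always sees the delay of an M/M/1 queue carrying only its own traffic.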

Within the stringent limitations imposed by our homogeneity as- 
sumptions, some extensions to these results are possible. For example, 
we can allow the more general service disciplines shown by Kelly (see 
Ref. 2, pages 58 and 78) to lead to product form provided (i) the state 
dependence and server sharing embodied in these disciplines extend 
only to the customers of highest priority present at a node and ignore 
all lower-priority customers, and (ii) all customers are treated identi- 
cally with respect to service time and routing. 

The above results might possibly be useful in some queuing network 
applications. For example, a first-cut evaluation of the impact of 
introducing data packet priorities into a packet switching network 
could be carried out by representing exogenous packet arrivals as 
Poisson and data links as exponential servers (Refs. 12, 13), and by assuming that
the mean data packet length and the traffic routing pattern are the
same for each priority class. If short control (e.g., acknowledgment)
packets were to be given priority over data packets, we find that our
homogeneity assumptions would be violated, although the effect of the
short packets could be approximated in several ways (Refs. 5, 13). Indeed, the
case of control packets receiving priority serves to illustrate a common 
situation in which customers receiving priority have significantly 
smaller service time requirements. Thus, the results described in this 
section are expected to find limited use. The remainder of this paper 
does not make any such homogeneity assumption; unfortunately, by 
relaxing this assumption, we are able to treat only networks consisting 
of two nodes. 



III. MODEL A: TWO NODES WITH PRIORITIES THE SAME AT EACH 
NODE 

We now consider the two-node closed queuing network introduced
in Section I as model A and shown in Fig. 1a. The network consists of
two nodes, the left- and right-hand nodes. There are $N$ high-priority
and $M$ low-priority customers. High-priority customers take preemptive
priority over low-priority customers at each node. All service times
are assumed exponentially distributed: the high-priority customers
have a mean service time of $\nu^{-1}$ at the left node and 1 at the right;
low-priority customers have a mean service time of $\mu^{-1}$ at the left node and
$\lambda^{-1}$ at the right. After service at one node is completed, customers are
immediately routed to the other node without changing class. We
assume $\nu$, $\mu$, $\lambda$, $N$, and $M$ are all positive.

The state of the system is described by the vector (n, m) where n 
(respectively m) is the number of high- (respectively low) priority 
customers at the left node. The state (n, m) evolves as a Markov chain 
with stationary distribution p(n, m). It is obvious that the stationary 
distribution is also the limiting distribution since the chain is finite 
and irreducible. The transitions of (n, m) are shown in Fig. 2a. 

By definition, $p(n, m)$ satisfies the balance equations

\[
\begin{aligned}
p(n, m)\,[\nu 1_{\{n>0\}} &+ \mu 1_{\{n=0,\,m>0\}} + 1_{\{n<N\}} + \lambda 1_{\{n=N,\,m<M\}}] \\
&= p(n-1, m) + p(n+1, m)\nu + p(N, m-1)\lambda 1_{\{n=N\}} \\
&\quad + p(0, m+1)\mu 1_{\{n=0\}}, \qquad 0 \le n \le N,\quad 0 \le m \le M,
\end{aligned}
\tag{1}
\]

where $1_{\{\cdot\}}$ denotes the indicator function which has value 1
(respectively 0) when the predicate within the braces is true (respectively
false). Note that we are adopting the convention that $p(n, m) = 0$ when
$(n, m) \notin [0, N] \times [0, M]$.
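
Before solving eq. (1) analytically, it can be useful to have a direct numerical solution to compare against. The sketch below is an illustration rather than part of the original analysis: it builds the transition rates of model A from the four transition types in eq. (1) and solves the stationary equations by plain Gaussian elimination. The function names and the dense-matrix representation are assumptions made for this example, and the approach is only practical for small $N$ and $M$.

```python
def model_a_generator(N, M, nu, mu, lam):
    """Return (states, Q) for model A; Q[i][j] is the rate from state i to j."""
    states = [(n, m) for n in range(N + 1) for m in range(M + 1)]
    index = {state: k for k, state in enumerate(states)}
    size = len(states)
    Q = [[0.0] * size for _ in range(size)]

    def add(src, dst, rate):
        Q[index[src]][index[dst]] += rate
        Q[index[src]][index[src]] -= rate

    for n, m in states:
        if n > 0:                # high-priority completion at the left node
            add((n, m), (n - 1, m), nu)
        if n == 0 and m > 0:     # low-priority served at left only if n == 0
            add((0, m), (0, m - 1), mu)
        if n < N:                # high-priority completion at the right node
            add((n, m), (n + 1, m), 1.0)
        if n == N and m < M:     # low-priority served at right only if n == N
            add((N, m), (N, m + 1), lam)
    return states, Q


def stationary(Q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gaussian elimination."""
    size = len(Q)
    A = [[Q[r][c] for r in range(size)] for c in range(size)]  # transpose of Q
    b = [0.0] * size
    A[-1] = [1.0] * size  # replace one redundant equation by normalization
    b[-1] = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, size):
            factor = A[r][col] / A[col][col]
            if factor:
                for c in range(col, size):
                    A[r][c] -= factor * A[col][c]
                b[r] -= factor * b[col]
    pi = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(A[r][c] * pi[c] for c in range(r + 1, size))
        pi[r] = (b[r] - tail) / A[r][r]
    return pi
```

Calling `stationary(model_a_generator(N, M, nu, mu, lam)[1])` gives $p(n, m)$ in the order of the returned `states` list.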

We wish to solve for $p(n, m)$. The technique we use is best explained
by reference to the state transition diagram shown in Fig. 2a. First
note that for any $n > 0$, $p(n, m)$, $0 \le m \le M$ is expressible in terms of
$p(n-1, m)$, $0 \le m \le M$. Hence, $p(n, m)$, $0 \le m \le M$, $0 < n \le N$ can be
expressed in terms of the left boundary values $p(0, m)$, $0 \le m \le M$, and
solution for $p(0, m)$, $0 \le m \le M$ rests on the balance equations for the
right boundary $p(N, m)$, $0 \le m \le M$. This observation is generally true
for an arbitrary two-dimensional birth-death process (provided all
right-to-left horizontal transitions are present) but not always useful
since the resultant equations for $p(0, m)$, $0 \le m \le M$ are not easily
solved. Fortunately, in our case, the absence of vertical transitions
from states $(n, m)$, $0 < n < N$ caused by the priority structure results
in relations for $p(0, m)$ which comprise a simple difference equation of
order two which is easily solved and yields an explicit solution for
$p(n, m)$. Before proceeding with this technique, we mention that a
computational method based on such an observation has been proposed in
which the recursive structure is used to reduce the problem of finding
the stationary distribution of certain $N \times M$ birth-death processes to
the solution of $N$ equations in $N$ unknowns (Ref. 14).

Fig. 2. (a) State transition diagram for model A shown for N = 4, M = 3.
(b) State transition diagram for model A, but with left node nonpreemptive,
shown for N = 4, M = 3.

For notational ease, define $\alpha(m) = p(0, m)$, $0 \le m \le M$. Writing
eq. (1) for $n = 0$ yields

\[ p(1, 0) = \alpha(0)\nu^{-1} - \alpha(1)\mu\nu^{-1}, \tag{2} \]
\[ p(1, m) = \alpha(m)(\mu + 1)\nu^{-1} - \alpha(m+1)\mu\nu^{-1}, \quad 0 < m < M, \tag{3} \]
\[ p(1, M) = \alpha(M)(\mu + 1)\nu^{-1}, \tag{4} \]

and for $0 < n < N$

\[ p(n, m)(\nu + 1) = p(n+1, m)\nu + p(n-1, m), \quad 0 \le m \le M, \tag{5} \]

for which the general solution is

\[ p(n, m) = \frac{\alpha(m)(\nu^{-n} - \nu^{-1}) + p(1, m)(1 - \nu^{-n})}{1 - \nu^{-1}}, \quad 0 \le n \le N,\quad 0 \le m \le M, \tag{6} \]

provided $\nu \ne 1$. Hence, the problem is reduced to determination of
$\alpha(m)$, $0 \le m \le M$. This is done by writing eq. (1) for $n = N$,

\[ p(N, 0)(\nu + \lambda) = p(N-1, 0), \tag{7} \]
\[ p(N, m)(\nu + \lambda) = p(N-1, m) + p(N, m-1)\lambda, \quad 0 < m < M, \tag{8} \]
\[ p(N, M)\nu = p(N-1, M) + p(N, M-1)\lambda. \tag{9} \]

Substituting eqs. (2) and (6) into eq. (7),

\[ \alpha(1)/\alpha(0) = (\lambda/\mu)\,[\nu^{-N}(\nu - 1)]/(\nu + \lambda - 1 - \lambda\nu^{-N}). \tag{10} \]

Assuming $M > 1$, taking $m = 1$ in eq. (8), and using eqs. (2), (3), and
(6) yields $\alpha(2)/\alpha(1) = r$, where

\[ r = \frac{\lambda}{\mu}\cdot\frac{\mu - (\mu - \nu + 1)\nu^{-N}}{\nu + \lambda - 1 - \lambda\nu^{-N}}. \tag{11} \]

Note that the denominator of $r$ is not zero because $\nu \ne 1$. Taking
$1 < m < M$ in eq. (8) with eqs. (3) and (6) yields the difference equation

\[
\begin{aligned}
&\alpha(m+1)\,[\mu\lambda\nu^{-N} + \mu - \mu\lambda - \mu\nu] \\
&\quad + \alpha(m)\,[\mu\nu + \lambda\nu^{1-N} + 2\lambda\mu - 2\lambda\mu\nu^{-N} - \lambda\nu^{-N} - \mu] \\
&\quad + \alpha(m-1)\,[\lambda\mu\nu^{-N} + \lambda\nu^{-N} - \lambda\mu - \lambda\nu^{1-N}] = 0, \quad 1 < m < M,
\end{aligned}
\]

which has characteristic roots 1, $r$. But since $\alpha(2)/\alpha(1) = r$, we have

\[ \alpha(m) = \alpha(1)\,r^{m-1}, \quad 1 \le m \le M. \tag{12} \]

It can now be verified that these results are consistent with the one
unused eq. (9) and that the result holds true for $M = 1$.

Substituting eqs. (2) to (4), (10), and (12) into eq. (6) and simplifying
yields the general solution for $\nu \ne 1$

\[ p(n, m) = C r^m \left[ r\nu^{-n\delta(m-M)} - \nu^{(N-n)\delta(m)} + \left((1 - \nu)\mu^{-1} + 1 - r\right)\nu^{-n} \right], \quad 0 \le n \le N,\quad 0 \le m \le M, \tag{13} \]

where $r$ is given by eq. (11) and $\delta(\cdot)$ is the Kronecker delta: $\delta(0) = 1$;
$\delta(k) = 0$, $k \ne 0$. The normalizing constant $C$ is obtained by demanding
that $p(n, m)$, $0 \le n \le N$, $0 \le m \le M$ is a probability distribution,
yielding

\[ C = \frac{\mu(1 - \nu^{-1})}{(1 - \nu^{-N-1})\left[(1 - \nu)\sum_{i=0}^{M} r^i + \mu(1 - \nu^N)\right]}. \tag{14} \]

In eq. (14) and below we refrain from summing geometric series of the
form

\[ \sum_{i=0}^{n} x^i \]

to avoid a special statement for the case $x = 1$.

Our treatment has excluded the case $\nu = 1$ for which the solution to
eq. (5) is $p(n, m) = \alpha(m)(1 - n) + p(1, m)n$. Rather than re-solving for
this case, it is easier to take the limit as $\nu \to 1$ in eqs. (13) and (14); this
must yield the correct solution since the coefficients in eq. (1) are
continuous in $\nu$ and the eigenvectors of a matrix are continuous
functions of its elements. Equations (13) and (14) for $p(n, m)$ can be
rewritten in a form which is well-defined for $\nu = 1$:

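\[ p(n, m) = D s^m \left[ \mu^{-1}\nu^{-n} + (1 - s)\nu^{-1}\sum_{i=0}^{n-1}\nu^{-i} + 1_{\{m=0\}}\sum_{i=0}^{N-n-1}\nu^{i} + 1_{\{m=M\}}\,s\,\nu^{-1}\sum_{i=0}^{n-1}\nu^{-i} \right], \tag{15} \]

where

\[ s = \left[\sum_{i=1}^{N}\nu^{-i} + \mu^{-1}\nu^{-N}\right]\left[\sum_{i=1}^{N}\nu^{-i} + \lambda^{-1}\right]^{-1}, \tag{16} \]

\[ D^{-1} = \left[\sum_{i=0}^{N-1}\nu^{i} + \mu^{-1}\sum_{i=0}^{M}s^{i}\right]\sum_{i=0}^{N}\nu^{-i}, \tag{17} \]

and summations over descending ranges are taken to be zero. Since
the solution given by eqs. (15) to (17) is continuous in $\nu > 0$, it is the
general solution for all $\nu > 0$.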
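
The closed form can be checked mechanically against the balance equations. The sketch below is an illustration under stated assumptions: it evaluates eqs. (15) to (17) as reconstructed above (with $M \ge 1$) and reports the largest violation of eq. (1) over all states. The function names are invented for this example.

```python
def model_a_closed_form(N, M, nu, mu, lam):
    """Evaluate p(n, m) from eqs. (15)-(17); returns (distribution, s)."""
    geo = sum(nu ** -i for i in range(1, N + 1))
    s = (geo + nu ** -N / mu) / (geo + 1.0 / lam)               # eq. (16)
    d_inv = ((sum(nu ** i for i in range(N))
              + sum(s ** i for i in range(M + 1)) / mu)
             * sum(nu ** -i for i in range(N + 1)))             # eq. (17)
    D = 1.0 / d_inv
    p = {}
    for n in range(N + 1):
        g = sum(nu ** -i for i in range(n)) / nu   # nu^-1 * sum_{i<n} nu^-i
        h = sum(nu ** i for i in range(N - n))     # sum_{i<N-n} nu^i
        for m in range(M + 1):
            term = nu ** -n / mu + (1.0 - s) * g
            if m == 0:
                term += h
            if m == M:
                term += s * g
            p[(n, m)] = D * s ** m * term                       # eq. (15)
    return p, s


def balance_residual(p, N, M, nu, mu, lam):
    """Largest absolute difference between the two sides of eq. (1)."""
    def get(n, m):
        return p.get((n, m), 0.0)  # zero outside [0, N] x [0, M]

    worst = 0.0
    for (n, m), value in p.items():
        out_rate = (nu * (n > 0) + mu * (n == 0 and m > 0)
                    + 1.0 * (n < N) + lam * (n == N and m < M))
        inflow = (get(n - 1, m) + nu * get(n + 1, m)
                  + lam * get(N, m - 1) * (n == N)
                  + mu * get(0, m + 1) * (n == 0))
        worst = max(worst, abs(value * out_rate - inflow))
    return worst
```

Because the geometric sums are evaluated term by term, the same code covers $\nu = 1$ without a special case, which mirrors the reason the sums are left unsummed in the text.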

Remarks 

(i) The system considered here can be generalized trivially to
allow mean service time of the high-priority customers at the right
node to be $\kappa^{-1}$ (rather than unity) and to allow routing of a customer
back to the node at which it has just completed. Let high-priority
customers on completion at the left (respectively right) node be routed
to the right (respectively left) node with probability $p_{lr}$ (respectively
$p_{rl}$) and be routed to the node at which completion has just occurred
with complementary probability $1 - p_{lr}$ (respectively $1 - p_{rl}$). Let the
low-priority customers have similarly defined probabilities $q_{lr}$, $q_{rl}$.
Then the results in eqs. (13) to (17) remain valid with $\nu$, $\mu$, $\lambda$ replaced,
respectively, by $\nu p_{lr}/\kappa p_{rl}$, $\mu q_{lr}/\kappa p_{rl}$, $\lambda q_{rl}/\kappa p_{rl}$.

(ii) It is readily verified that the marginal distribution of the
high-priority customers $p(n, \cdot) = \sum_{m=0}^{M} p(n, m)$ agrees with the
result for an ordinary two-node closed network, viz.

\[ p(n, \cdot) = \nu^{-n} \Big/ \sum_{i=0}^{N}\nu^{-i}. \]

Of course, this must be the case, since the high-priority customers
experience no interference from the low-priority customers.

(iii) Let $T_H$ and $T_L$ denote the throughput of high- and low-priority
customers, respectively. Then

\[ T_H = \nu(1 - p(0, \cdot)) = \sum_{i=0}^{N-1}\nu^{-i} \Big/ \sum_{i=0}^{N}\nu^{-i} \]

and

\[ T_L = \mu[p(0, \cdot) - p(0, 0)] = \sum_{i=1}^{M} s^{i} \Big/ \left[\left(\sum_{i=0}^{N-1}\nu^{i} + \mu^{-1}\sum_{i=0}^{M} s^{i}\right)\sum_{i=0}^{N}\nu^{-i}\right]. \]

If the generalization referred to in (i) is adopted, then as well as the
replacements specified in (i), the quantities $T_H$, $T_L$ must be multiplied
by $\kappa p_{rl}$ if they are to have the units of customers per unit time.

Mean delay formulas for high- and low-priority customers at each
node are immediately obtained from $p(n, m)$ by Little's law, but are
omitted here.

(iv) In the special case that $\mu = \nu$ and $\lambda = 1$, i.e., service times do
not depend on the priority class, the throughputs can be obtained from
the considerations of Section II. In that case, the distribution of $n + m$
will be the same as a two-node closed network with $N + M$ customers
and no priorities. Hence, we can immediately write down

\[ T_H = \nu\left[1 - \left(\sum_{i=0}^{N}\nu^{-i}\right)^{-1}\right], \]

\[ T_L = \nu\left[1 - \left(\sum_{i=0}^{N+M}\nu^{-i}\right)^{-1}\right] - T_H, \]

and this checks with the result in (iii).

Note also that in this case ($\mu = \nu$, $\lambda = 1$), $s = \nu^{-1}$ and
$p(n, m) = D\nu^{-1-m}$, $0 < m < M$. Thus, if we observe the system only at
instants when there is at least one low-priority customer at each node,
the distribution of high-priority customers is seen to be uniform.
(v) When $s > 1$, using the expressions in (iii),

\[ \lim_{M \to \infty}\left(\frac{T_H}{\nu} + \frac{T_L}{\mu}\right) = 1, \]

i.e., the utilization of the left-hand node approaches unity as the
number of low-priority customers increases. It follows from the
definition of $s$ that $s > 1$ if and only if $\lambda/\mu > \nu^N$. Using this fact and
reversing the two nodes shows conversely that if $s < 1$, utilization of
the right-hand node approaches unity as $M \to \infty$. Thus, whether $s > 1$
or $s < 1$ determines which node becomes the limiting factor in trying
to obtain increased total throughput by introducing additional
low-priority customers.

The following criterion can be deduced. Suppose a two-node system
with $N$ circulating customers has a bottleneck of strength $\nu$, $\nu > 1$ at
the right node. For moderate values of $N$, the right node will be almost
completely utilized and the left node underutilized. If the low-priority
customers have a bottleneck at the left node of strength at least $\nu^N$,
i.e., $\lambda/\mu > \nu^N$, then the left node can have a utilization as close as
desired to unity by introduction of sufficiently many low-priority
customers. If the low-priority customers have a bottleneck weaker
than $\nu^N$ at the left node, then complete use of the left node can never
be achieved by introducing low-priority customers. This rule of thumb
can be deduced intuitively as follows. The high-priority customers can
be thought of as causing a reduction of processing rate to
$\mu[1 - T_H/\nu]$ at the left node and to $\lambda[1 - T_H]$ at the right node. Thus, the left
node can be fully utilized if and only if $\mu(1 - T_H/\nu) < \lambda(1 - T_H)$ which
reduces to the condition $\lambda/\mu > \nu^N$. Of course, the formulas for $T_H$ and
$T_L$ make precise the actual throughputs achieved as a function of the
parameters. This is illustrated in Section V by an example.

(vi) The same solution technique can be used to obtain results for
more than two priority classes, although the solution complexity
increases. We state the result for a three-class, two-node system with
number in each class $N$, $M$, $L$ (in order of decreasing priority), with
service time at the left node $\mu^{-1}$ (for all classes) and 1 at the right,
$\mu \ne 1$. If $p(n, m, l)$ describes the stationary probability of having
$n$, $m$, $l$ customers at the left (in order of decreasing priority), then,
provided $N$, $M$, $L$, $\mu$ are positive,

6(ju- n - jT 1 "*), 1 = 0, m = 

B/x-V - M -1 ""). 0</<L, m = 



p(n, m, I) = 



B(jM - m _ B[1 - 



1=0, 0<m<M 



where 



B(l - 11'%-'-", 0<1<L, 0<m<M, 

Bfi-'ifi-" - [i-"""- 1 ), l = L, 0<m<M 

Bfi-'d - ti-"- } ), < I < L, m = M 

bfT M - N - L (l ~ /i" n_1 ), l = L, m = M 

B = b(l- /x)/(l - fi M+N+1 ) 
and

b = (l - /T l )/[(i - (t-^Hl - fi-"-"- 1 - 1 )]. 

A result for three priority classes is sometimes useful in that it allows 
comparison of the performance of a designated customer class with 
two aggregated classes: one representing those customer classes of 
higher priority and the other those customer classes of lower priority. 

(vii) A variation on model A which will be of interest in Section VI
is the case where priority is again preemptive at the right node but
nonpreemptive at the left node. We use the same notation as above,
except that the state description now becomes $(n, m, s)$, where $s = H$
(respectively $L$) when the left node is processing high- (respectively
low-) priority customers. The state transition diagram for this model is
shown in Fig. 2b where transitions out of the transient states
$\{(n, 0, L),\ 0 \le n \le N;\ (0, m, H),\ 0 < m \le M\}$ are omitted and we have
adopted the convention that $s = H$ when $n = m = 0$. The stationary
probabilities $p(n, m, s)$ can be solved for by a technique similar to that
used above. Namely, letting $p(N, m, L) = \alpha(m)$, $1 \le m \le M$, $p(0, 0, H)
= \alpha(0)$ and writing balance equations for all states with $s = L$, yields
expressions for $p(n, m, L)$, $1 \le m \le M$, $0 \le n < N$ and $p(1, m, H)$,
$0 \le m \le M$ in terms of $\alpha(\cdot)$. The balance equations for states $(n, m, H)$,
$0 \le m \le M$, $1 \le n < N$ yield $p(n, m, H)$, $0 \le m \le M$, $1 < n \le N$, again in
terms of $\alpha(\cdot)$. The analysis is completed by solving the third-order
constant coefficient difference equation for $\alpha(\cdot)$ that is obtained from
the balance equations for states $(N, m, H)$, $0 \le m \le M$. This approach
leads to a solution which, although closed form, is of limited use
because of its complexity.

Instead, we now briefly describe a much simpler approximate solution
which appears justifiable for our applications. The system is
approximated by omitting the transitions shown by dashed lines in
Fig. 2b, i.e., $(N, m, L) \to (N, m+1, L)$, $1 \le m < M$. With these
transitions omitted, the solution steps simplify and we obtain

\[ p(n, m, L) = \beta(m)(\nu - 1)(\mu + 1)^{-n-1}(1 + \mu^{-1})^{1_{\{n=N\}}}, \quad 1 \le m \le M,\quad 0 \le n \le N, \]

\[
\begin{aligned}
p(n, m, H) = {} & \beta(m)\,(1_{\{m>0\}} - \nu^{-n}) \\
& + \beta(m+1)\left\{-1 + \frac{\mu\nu^{-n} - (\mu + 1)^{-n}(\nu - 1)}{\mu - \nu + 1}\right\}1_{\{m<M\}}, \\
& 0 \le m \le M,\quad 1 \le n \le N \quad \text{or} \quad m = n = 0,
\end{aligned}
\]

where

\[ \beta(m) = C t^{m-1}(1 - \nu^{N})^{\delta(m)}, \]

\[ t = \frac{\lambda(1 - \nu^{-N})(\mu - \nu + 1)}{(\mu - \nu + 1)(\nu + \lambda - 1) + \lambda(\mu + 1)^{-N}(\nu - 1) - \lambda\mu\nu^{-N}}, \]

$C$ is a normalizing constant, $M \ge 1$, $N \ge 1$, $\nu \ne 1$, $\nu \ne \mu + 1$, and all
unspecified probabilities are zero. As before, the cases $\nu = 1$, $\nu =
1 + \mu$ are obtained by taking limits.

The omission of the dashed transitions amounts to a denial of service
to low-priority customers at the right node when $n = N$ and $s = L$.
This is a justifiable approximation when the probability is small that
$N$ high-priority customers and one low-priority customer can be served
at the right in less time than it takes to serve one low-priority
customer at the left. This condition is stated as $(1 + \mu)^{-N}\lambda/(\lambda + \mu) \ll 1$,
and this expression is shown to hold for our applications in Section
VI. The approximation causes an underestimation of low-priority
throughput and an overestimation of high-priority throughput.
Although we omit details here, the opposite bounds (an upper
bound for low-priority throughput and a lower bound for
high-priority throughput) could also be obtained by replacing the transition
$(N, m, L) \to (N, m+1, L)$ by $(N, m, L) \to (N, M, L)$, $1 \le m < M$,
leading, in turn, to a system which is solved the same way.
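
The bottleneck criterion of Remark (v) can be explored numerically with the throughput expressions of Remark (iii). The sketch below is illustrative only: the function name and the parameter values in the usage note are assumptions, and the formulas are those of the fully preemptive model A, not the nonpreemptive variation of Remark (vii).

```python
def model_a_throughputs(N, M, nu, mu, lam):
    """Return (T_H, T_L, s) for model A with preemptive priority at both nodes."""
    geo = sum(nu ** -i for i in range(1, N + 1))
    s = (geo + nu ** -N / mu) / (geo + 1.0 / lam)      # eq. (16)
    norm = sum(nu ** -i for i in range(N + 1))
    t_high = sum(nu ** -i for i in range(N)) / norm
    bracket = (sum(nu ** i for i in range(N))
               + sum(s ** i for i in range(M + 1)) / mu)
    t_low = sum(s ** i for i in range(1, M + 1)) / (bracket * norm)
    return t_high, t_low, s


def utilizations(N, M, nu, mu, lam):
    """Return (left, right) node utilizations."""
    t_high, t_low, _ = model_a_throughputs(N, M, nu, mu, lam)
    left = t_high / nu + t_low / mu    # high service rate nu, low rate mu
    right = t_high / 1.0 + t_low / lam  # high service rate 1, low rate lam
    return left, right
```

For example, with `nu = 2.0` and `N = 3` the threshold is $\nu^N = 8$. Choosing `mu = 0.1, lam = 1.0` (ratio 10) drives the left utilization toward 1 as `M` grows, whereas `mu = 0.25, lam = 1.0` (ratio 4) saturates the right node instead and leaves the left node only partly used.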

IV. MODEL B: TWO NODES WITH PRIORITIES REVERSED 

Model B, shown in Fig. 1b, is now considered. This queuing network
differs from model A only in that the priority at the right node has
been reversed. We now refer to $N$ type-1 customers and $M$ type-2
customers (see Fig. 1b) with $n$ (respectively $m$) being the number of
type 1 (respectively type 2) customers at the left node. We assume
$N$, $M$, $\nu$, $\mu$, $\lambda$ are all positive.

The state transition diagram for this model is shown in Fig. 3. One
immediately recognizes that states $\{(n, m): n > 0 \text{ and } m < M\}$ are
transient, i.e., in equilibrium, one of the two high-priority queues is
always empty. This is also easily deduced by considering system
behavior at instants after the state $(0, M)$, i.e., both high-priority
queues empty, is reached. One of the low-priority queues completes a
service and then that customer type prevents service of the other
customer type until the state $(0, M)$ is again attained.

Fig. 3. State transition diagram for model B shown for N = 4, M = 3.

Let $p(n, m)$ be the stationary probability of state $(n, m)$. Using the
observation as to which states are persistent, we have immediately
that

\[ p(0, m) = C(\lambda/\mu)^m, \quad 0 \le m \le M, \]

and

\[ p(n, M) = C(\lambda/\mu)^M\nu^{-n}, \quad 0 \le n \le N, \]

where

\[ C^{-1} = \sum_{m=0}^{M-1}(\lambda/\mu)^m + (\lambda/\mu)^M\sum_{n=0}^{N}\nu^{-n} \]

and all other probabilities are zero. The values of $p(n, m)$ are also the
limiting probabilities.
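
Since the persistent states of model B form a single chain of birth-death type, the distribution is easy to tabulate. The following sketch is an illustration with invented function names: it evaluates $p(n, m)$ on the persistent states and computes the type-1 and type-2 throughputs directly as $\nu\sum_{n \ge 1} p(n, M)$ and $\mu\sum_{m \ge 1} p(0, m)$.

```python
def model_b_distribution(N, M, nu, mu, lam):
    """Stationary probabilities of the persistent states of model B."""
    ratio = lam / mu
    c_inv = (sum(ratio ** m for m in range(M))
             + ratio ** M * sum(nu ** -n for n in range(N + 1)))
    c = 1.0 / c_inv
    p = {(0, m): c * ratio ** m for m in range(M + 1)}
    for n in range(1, N + 1):
        p[(n, M)] = c * ratio ** M * nu ** -n
    return p


def model_b_throughputs(N, M, nu, mu, lam):
    """Return (T_1, T_2) computed from the stationary distribution."""
    p = model_b_distribution(N, M, nu, mu, lam)
    # Type 1 is served at the left node at rate nu whenever n > 0.
    t1 = nu * sum(p[(n, M)] for n in range(1, N + 1))
    # Type 2 is served at the left node at rate mu only when n = 0 and m > 0.
    t2 = mu * sum(p[(0, m)] for m in range(1, M + 1))
    return t1, t2
```

All states not returned by `model_b_distribution` have probability zero, in line with the observation that one of the two high-priority queues is always empty in equilibrium.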

Remarks 

(i) For the case $\nu = \mu$, $\lambda = 1$, this result can be obtained directly
from standard closed queuing network results, together with the
observation regarding persistent states.

(ii) For this model, there is no absolute priority given to either
type of customer over the other; the service received by each type is
determined by parameter values. We can write down the throughput
of type 1 (respectively type 2) customers, denoted $T_1$ (respectively $T_2$):

\[ T_1 = \nu\sum_{n=1}^{N} p(n, M) = \nu\left[1 + \left(\sum_{n=1}^{N}\nu^{-n}\right)^{-1}\sum_{m=0}^{M}(\mu/\lambda)^{m}\right]^{-1}, \]

\[ T_2 = \mu\sum_{m=1}^{M} p(0, m) = \lambda\left[1 + \left(\sum_{m=1}^{M}(\lambda/\mu)^{-m}\right)^{-1}\sum_{n=0}^{N}\nu^{-n}\right]^{-1}. \]

(iii) In view of the earlier observation regarding transient states, we
point out one aspect of the behavior of this system. As already
described, in equilibrium, the system will alternate between periods
during which only customers of one type are processed. The distribution
of the lengths of these periods can be obtained using a standard
M/M/1/K ($K$ waiting positions) busy period argument. In the situation
that $\nu \ll 1$ (respectively $\lambda \ll \mu$), while each customer type may perceive
satisfactory (long-term) average throughput, the duration of the period
during which only type 1 (respectively type 2) customers are served
can be extremely long. The adverse impact of such behavior on the
tails of delay distributions is obvious. This phenomenon should be
taken into account when an application requires good short-term, as
well as long-term performance.

(iv) An interesting result for this system is that the delay of one
type of customer at its higher-priority queue is not influenced by the
presence of customers of the other type. For example, the mean delay
of type 1 customers at the left node is given by

\[ \sum_{n=1}^{N} n\,\nu^{-n} \Bigg/ \sum_{n=0}^{N-1} \nu^{-n} \]

for any λ, μ, M, corresponding to the ordinary closed queuing network
result where M would be zero. This invariance is explained as follows.
If the random variable n is observed only at instants when it satisfies
n > 0, then it is indistinguishable from its behavior in an ordinary
queuing network where M would be zero. This is because, in equilibrium,
n > 0 implies m = M, and so the type 1 customer sees no
interference from type 2 customers. But type 1 customer delays at the
left are only measured when n > 0 and, therefore, the distribution of
delay is unaffected by type 2 customers.
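
The invariance can also be seen numerically by applying Little's law (mean number of type 1 customers at the left node divided by type 1 throughput) to the stationary probabilities of model B. This is a minimal sketch with hypothetical function names; the time unit is the mean type 1 service time at the right node, as in the model.

```python
def model_b_weights(N, M, nu, mu, lam):
    """Normalized probabilities of states (0, m), m < M, and (n, M)."""
    r = lam / mu
    p0 = [r**m for m in range(M)]                 # states (0, m), m < M
    pM = [r**M * nu**(-n) for n in range(N + 1)]  # states (n, M)
    c = 1.0 / (sum(p0) + sum(pM))
    return [c * x for x in p0], [c * x for x in pM]


def type1_left_delay(N, M, nu, mu, lam):
    """Mean type 1 delay at the left node via Little's law."""
    _, pM = model_b_weights(N, M, nu, mu, lam)
    mean_n = sum(n * pM[n] for n in range(N + 1))
    throughput = nu * sum(pM[1:])
    return mean_n / throughput


# M = 0 is the ordinary single-chain closed network.
alone = type1_left_delay(4, 0, 3.0, 1.0, 1.0)
shared = type1_left_delay(4, 4, 3.0, 0.5, 3.0)
```

Both calls return the same value because the normalizing constant and the factor (λ/μ)^M cancel in the ratio, which is the algebraic counterpart of the argument given in the remark.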

(v) A variation on model B where one node, say the left, has
nonpreemptive priorities can be solved by the same technique. In this
case, the only persistent states are {(n, m, L): n = 0 or m = M} and
{(n, m, H): n > 0 and m ≥ M - 1}.

V. COMPUTER SYSTEM EXAMPLE 

We now use the results we have developed to evaluate several
performance issues in a computer system. We consider a simplified
model of a computer system consisting of a CPU and an I/O device. The
system is primarily intended to process time-critical transactions, which
it does with a multiprogramming level of N. Each transaction makes
I/O requests requiring a mean service time of 10 ms, separated by CPU
processing of mean duration 5 ms. After a certain number of loops
between the CPU and I/O device, the transaction is completed and
leaves the system. At this point, the transaction is considered to
immediately re-enter the system in accordance with the assumption
that there is always a backlog of transactions waiting outside the
system.*

The transaction workload is clearly I/O-bound, and we ask whether
introduction of CPU-bound batch "filler" or background work at lower
priority will result in a worthwhile improvement of CPU utilization and,
consequently, total throughput. Suppose batch jobs are introduced
with a permitted multiprogramming level of M and that they require
y seconds of processing on the average between visits to the I/O device,
where the mean service time is 10 ms. We again assume that the batch
multiprogramming level of M is maintained by a backlog of work.*



* Each transaction or batch job alternately visits the CPU and I/O device, beginning
with the CPU, ending with the I/O device and looping between the two an arbitrarily
distributed number of times. Variations on this "scenario" can be modeled using the
technique in Remark (i), Section III.
We will initially assume that the transactions are given preemptive
priority at both the CPU and I/O device. This arrangement would
reflect an attitude that the performance of high-priority transaction
work should not be compromised by introduction of background work.
Hence, we use model A, and will be assuming that all service times are
exponentially distributed. Because of this exponential assumption, the
preemption can be either resume or restart (with resampling).
Preemptive-resume is more appropriate at the CPU, and preemptive-restart
(with resampling) is more appropriate at an I/O device such as a
moving-head disk, where the data transfer time is typically small
compared to seek and latency (justifying the restart assumption) and
service time depends on the physical location of the last interrupting
request's data (justifying the resampling assumption). To answer the
question regarding improvement in CPU utilization based on the results
of Section III, we plot CPU utilization as a function of the batch CPU
service time y for various N, M. Figure 4a gives the results for (N, M)
= (2, 0), (2, 2), (2, 10), (2, ∞) and (5, 0), (5, 2), (5, 10), (5, ∞). When the
high-priority multiprogramming level N = 2, we see that the batch
CPU times must be of the order of 10-20 ms before significant
improvement in CPU utilization occurs, and for times in excess of 40 ms,
almost complete CPU utilization is attained. For the case N = 5, the
batch CPU times need to be 100-200 ms to get significant improvement,
with 320 ms being the time for almost complete CPU utilization. These
results can be anticipated using the rule of thumb developed in Remark (v)
of Section III. For this example, the high-priority traffic experiences a
bottleneck of strength 2. If this bottleneck is stronger, or the
high-priority multiprogramming level is larger, our rule of thumb shows
that the batch work has to be much more strongly CPU-bound to justify
its introduction.

Fig. 4a. CPU utilization as a function of mean CPU batch time for various
multiprogramming levels when transactions receive priority at both the
CPU and I/O device.

In the above arrangement, the batch work needs to be heavily CPU-bound
to make its introduction worthwhile because batch I/O requests
encounter the transaction bottleneck at the I/O device. Even though
a batch job may only need infrequent I/O, it is often prevented from
continuing by transaction I/O. In such circumstances, one might
consider giving batch jobs high priority for I/O since, with appropriate
parameters, batch jobs will rarely hold up transactions. The
performance of such an arrangement is now evaluated using the results for
model B. Figure 4b gives the results for the same parameters as before
but with (N, M) = (2, 0), (2, 2), (2, ∞) and (5, 0), (5, 2); for this
arrangement, we must show high-priority CPU utilization as well as
total CPU utilization, since the former varies with M and y. We observe
that the introduction of batch jobs at a low multiprogramming level
can yield a considerable increase in total CPU utilization. This increase
can be accomplished with only a small effect on transaction throughput
provided y (the batch CPU service time) is 50 to 100 ms or larger. For
y comparable to or smaller than the transaction CPU service time of 5 ms,
a large degradation of transaction throughput occurs. If y is smaller
than 10 ms, then as M increases, transaction throughput approaches
zero. For certain parameter combinations (e.g., N = 5, y = 100 ms) the
latter arrangement offers a larger improvement in total CPU utilization
than the former, with only minor accompanying degradation of
transaction throughput. In general, the extra total throughput offered by
this "reversed priorities" arrangement must be weighed against any
deterioration in transaction service, as quantified by the model. We
mention that there may be a hazard in such a priority scheme since if
all the batch jobs are simultaneously undergoing an abnormal flurry of
I/O activity (or have actual mean service times substantially smaller
than those being modeled) transaction processing might be temporarily
halted, as discussed in Remark (iii) of Section IV.

Fig. 4b. CPU utilization as a function of mean CPU batch time for various
multiprogramming levels when transactions receive CPU priority and batch
jobs receive I/O priority.
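
The reversed-priorities arrangement can be explored with the model B stationary probabilities. The sketch below rests on an assumed mapping that is not spelled out in the text: type 1 customers are transactions, the left node is the CPU, the right node is the I/O device, and the time unit is one mean I/O service (10 ms), so that ν = 2, μ = 10 ms / y and λ = 1. Function names are hypothetical.

```python
def model_b_throughputs(N, M, nu, mu, lam):
    """Return (T1, T2) for model B from its stationary probabilities."""
    r = lam / mu
    p0 = [r**m for m in range(M + 1)]             # weights of states (0, m)
    pM = [r**M * nu**(-n) for n in range(N + 1)]  # weights of states (n, M)
    norm = sum(p0) + sum(pM[1:])                  # state (0, M) counted once
    t1 = nu * sum(pM[1:]) / norm
    t2 = mu * sum(p0[1:]) / norm
    return t1, t2


def cpu_utilization(N, M, y_ms, cpu_ms=5.0, io_ms=10.0):
    """Return (transaction, total) CPU utilization for batch CPU time y_ms."""
    nu = io_ms / cpu_ms  # transaction CPU rate (assumed mapping)
    mu = io_ms / y_ms    # batch CPU rate (assumed mapping)
    lam = 1.0            # batch I/O rate; transaction I/O rate is also 1
    t1, t2 = model_b_throughputs(N, M, nu, mu, lam)
    transaction = t1 / nu
    return transaction, transaction + t2 / mu


base, _ = cpu_utilization(5, 0, 100.0)
trans, total = cpu_utilization(5, 2, 100.0)
print(f"transactions: {base:.3f} -> {trans:.3f}, total CPU: {total:.3f}")
```

Under these assumptions, N = 5, M = 2, y = 100 ms gives nearly full total CPU utilization with a loss of about 5 percent in transaction CPU utilization, while for y below 10 ms the transaction share shrinks toward zero as M grows, in line with the observations above.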

VI. DATA COMMUNICATIONS EXAMPLE 

We consider a full-duplex communication channel terminated at end
points labeled P and Q by front-end communication processors. We
assume the following simple transmission protocol. Messages are
transmitted from one endpoint to the other and individual short
acknowledgments are returned in the reverse direction. The acknowledgments
serve the dual purpose of error control and flow control, and an
endpoint must stop transmitting when it has a number W of outstanding
acknowledgments. Such a flow control scheme is often referred to
as window flow control, and W the window length. This protocol has
been studied by Reiser in the network context using closed queuing
networks with suitable heuristics to approximate the effect of different
message sizes at first-come first-served queues and prioritized
acknowledgments (Ref. 5). Our models here are less sophisticated with a
two-node queuing network representing a single full-duplex channel, but
they do allow some exact results for two chains with different message
sizes and priority. The questions we seek to answer relate primarily to the
effect of providing two grades of service between points P and Q.

We assume that the end points P and Q return an acknowledgment
as soon as they complete receiving a message. This is tantamount to
assuming that the front-end processors are fast in comparison to the
data links and have sufficient memory space so that acknowledgments
are rarely withheld for purposes of flow control. We are also assuming
that the data channels are essentially error free and retransmissions
are rarely needed. Messages and acknowledgments are assumed to
require an exponentially distributed time for transmission through the
data channel. This assumption is more reasonable for messages than
for acknowledgments, where it is tolerable since acknowledgments are
usually relatively short. There are two grades of service available,
referred to as grades 1 and 2. Grade 1 service is regarded as having
premium throughput characteristics, whereas grade 2 is designed to
operate in a background mode to obtain increased use of the channel.
We consider three configurations, referred to as schemes I to III, which
are shown in Fig. 5.

Fig. 5. Three schemes to prioritize a data link.

6.1 Three schemes to prioritize a data link

Scheme I

We consider a grade 1 transfer to be in progress from P to Q and
evaluate the possibility of introducing grade 2 message flow in the
same direction. In order that grade 1 service be minimally affected by
the introduction of grade 2 service, we specify that grade 1 messages
and acknowledgments receive higher transmission priority than grade
2 messages and acknowledgments. This configuration is shown in Fig.
5a. Preemption of messages is reasonable in packetized or framed
transmission where data streams are, or can be, broken into smaller
parts for transmission. Depending on the implementation, preemption
of acknowledgments may or may not be possible and we consider both
possibilities. Hence, we use model A where we identify the left
(respectively right) node with the Q-to-P (respectively P-to-Q) channel
of the full-duplex link. Suppose an acknowledgment takes a mean of 1
ms for transmission, a grade 1 message an average of α seconds, and a
grade 2 message an average of β seconds. Let the window size for grade
1 (respectively grade 2) messages be N (respectively M). Then taking,
for example, N = M = 4, α = β = 3 ms and allowing preemption of
grade 2 acknowledgments, we find that grade 1 throughput is 330.58
messages/s and grade 2 throughput is 2.72 messages/s, i.e., grade 2
service accounts for only 0.8 percent of the total utilization of the
P-to-Q channel. If we disallow preemption of acknowledgments, then
the model described in Remark (vii) of Section III is applicable and
yields 330.56 messages/s and 2.73 messages/s as approximations to the
grade 1 and 2 throughputs, respectively. The criterion for validity of
the approximation reads 1/1024 ≪ 1 in this case. These results are
easily anticipated since grade 1 messages already utilize 99.2 percent
of the P-to-Q channel and there is little point in introducing grade 2
service in the same direction.

Scheme II 

When there is grade 1 message transfer from P to Q and only
acknowledgment traffic from Q to P, it would seem worthwhile to
carry grade 2 messages from Q to P assuming, of course, that there is
a demand. As before, grade 2 messages receive lower preemptive
priority and grade 2 acknowledgments lower preemptive or
nonpreemptive priority. This arrangement, shown in Fig. 5b, calls for
use of model A, but this time with the left (respectively right) node
identified with the P-to-Q (respectively Q-to-P) channel. We take the
same definitions for N, M, α, β and a 1-ms acknowledgment time. Then
with N = M = 4, α = β = 3 ms and preemption of acknowledgments,
we find that grade 1 throughput is 330.58 messages/s and grade 2
throughput is 7.14 messages/s. Without preemption of acknowledgments,
throughputs become 328.99 and 11.87, respectively, and the
criterion for validity of the approximation reads 1/64 ≪ 1. When β is
increased to 60 ms, grade 2 throughput (with preemption of
acknowledgments) becomes 5.63 messages/s but the messages are 20 times
longer so total effective message data throughput is increased by 34
percent. Without preemption of acknowledgments, throughputs become
7.79 messages/s for grade 2 and 329.54 messages/s for grade 1.
This is a 47 percent increase in data throughput. In this case, the
criterion for validity of the approximation is 1/976 ≪ 1. As expected,
grade 2 service in the opposing direction only attains a significant
throughput when its messages are much longer than those of grade 1.
When this is not the case, the grade 1 messages cause an impediment
to grade 2 acknowledgments that prevents a worthwhile grade 2
throughput.
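
The percentage increases in data throughput quoted for β = 60 ms can be reproduced by weighting each grade 2 message by its length relative to a grade 1 message. The sketch below assumes that the reference point is grade 1 operating alone at 330.58 messages/s; the function name is hypothetical, and the throughput figures are those quoted in the text.

```python
def data_throughput_gain(t1_alone, t1, t2, length_ratio):
    """Relative increase in data carried, in units of grade 1 messages."""
    return (t1 + length_ratio * t2) / t1_alone - 1.0


# Grade 2 messages are 60 ms / 3 ms = 20 times longer than grade 1 messages.
with_preemption = data_throughput_gain(330.58, 330.58, 5.63, 20.0)
without_preemption = data_throughput_gain(330.58, 329.54, 7.79, 20.0)
print(f"{100 * with_preemption:.0f}%, {100 * without_preemption:.0f}%")
```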

Scheme III 

The priority given to both grade 1 messages and acknowledgments
in Scheme II reflects a reluctance to allow grade 1 service to be more
than minimally degraded by grade 2 service. The Q-to-P channel is
underutilized because the grade 2 acknowledgments suffer the grade 1
bottleneck in the P-to-Q channel. But since acknowledgments are
relatively short, we now ask how much grade 2 service improves and
grade 1 service deteriorates when all acknowledgments receive priority
over all messages. This arrangement is shown in Fig. 5c and N, M, α,
β are defined as previously. By using the results of model B, a 1-ms
acknowledgment time and α = β = 3 ms, grade 1 and 2 throughputs
are each found to be 248 messages/s and both channels are 99.4 percent
utilized. Introducing grade 2 traffic has yielded a 50-percent increase
in total traffic carried, but at the expense of a 25-percent degradation
in grade 1 throughput. The effect of grade 2 acknowledgments on
grade 1 messages is reduced when β is increased. For example, when
β is 6 ms, grade 1 throughput is 292 messages/s and grade 2 throughput
is 118 messages/s. Now grade 2 service has yielded a 60-percent
increase in total message data throughput at the expense of a grade 1
throughput degradation of 12 percent. The introduction of grade 2
service has caused grade 1 message delay to increase from 10.65 ms to
12.30 ms and grade 1 acknowledgment delay remains at 1.45 ms (see
Remark (iv) of Section IV). Utilization of the P-to-Q (respectively
Q-to-P) channel is now 99.3 percent (respectively 99.95 percent).
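
The scheme III throughputs can be reproduced from the model B stationary probabilities. The sketch below assumes N = M = 4 as in the earlier schemes, and an identification that is inferred rather than stated: grade 1 is type 1, the left node is the Q-to-P channel (grade 1 acknowledgments at rate ν, grade 2 messages at rate μ), the right node is the P-to-Q channel (grade 2 acknowledgments at rate λ, grade 1 messages at rate 1), and the time unit is one grade 1 message transmission. Function names are hypothetical.

```python
ACK_MS = 1.0  # mean acknowledgment transmission time


def model_b_throughputs(N, M, nu, mu, lam):
    """Return (T1, T2) for model B from its stationary probabilities."""
    r = lam / mu
    p0 = [r**m for m in range(M + 1)]             # weights of states (0, m)
    pM = [r**M * nu**(-n) for n in range(N + 1)]  # weights of states (n, M)
    norm = sum(p0) + sum(pM[1:])                  # state (0, M) counted once
    return nu * sum(pM[1:]) / norm, mu * sum(p0[1:]) / norm


def scheme_iii(N, M, alpha_ms, beta_ms):
    """Grade 1 and grade 2 throughputs in messages/s (assumed mapping)."""
    nu = alpha_ms / ACK_MS    # grade 1 acknowledgments, Q-to-P
    mu = alpha_ms / beta_ms   # grade 2 messages, Q-to-P
    lam = alpha_ms / ACK_MS   # grade 2 acknowledgments, P-to-Q
    t1, t2 = model_b_throughputs(N, M, nu, mu, lam)
    per_second = 1000.0 / alpha_ms
    return t1 * per_second, t2 * per_second


for beta in (3.0, 6.0):
    g1, g2 = scheme_iii(4, 4, 3.0, beta)
    p_to_q = (g1 * 3.0 + g2 * ACK_MS) / 1000.0
    q_to_p = (g2 * beta + g1 * ACK_MS) / 1000.0
    print(f"beta = {beta} ms: {g1:.1f}, {g2:.1f} messages/s; "
          f"utilization {p_to_q:.4f}, {q_to_p:.4f}")
```

Under these assumptions the computed values round to the throughputs and channel utilizations quoted above for β = 3 ms and β = 6 ms.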

As already stated in Section IV, a system with priorities allocated in
such a way will tend to alternate between periods where customers of
only one type (grade in this case) are processed. Hence, for this scheme,
we need to make certain that during periods when grade 2 service is
occurring, grade 1 performance is not significantly disrupted in the
short term. In this case, the fact that the 1-ms acknowledgment time
is considerably shorter than β makes it unlikely that a second grade
2 message will complete transmission before the acknowledgment from
the former message is returned and, hence, likely that grade 1 service
will be able to continue without an intolerably long delay. On the other
hand, if the grade 2 source were to send a long sequence of very short
messages, grade 1 communications could be interrupted for a
considerable period. In an actual implementation, it might be desirable to
incorporate a mechanism to prevent this occurrence.

Although we have only examined a limited set of parameter values, 
we can summarize the results of this section. In the absence of rather 
extreme traffic parameters, there is little justification for introduction 
of a lower grade of service which operates essentially in a background 
mode. On the other hand, if some degradation of the premium service 
grade is tolerable, then otherwise unused channel capacity can carry 
an appreciable amount of lower-grade traffic. 

VII. COMPARISON WITH AN APPROXIMATION TECHNIQUE 

Our final application will be an evaluation of a convenient and
commonly used approximation technique for handling priorities (Refs. 3, 4, 5).
The technique considers the low-priority customers at a node to have
a dedicated server of rate μ_L(1 - ρ_H), where μ_L is the low-priority
service rate and ρ_H is the utilization due to high-priority customers at
that node. As noted in Ref. 5, this approximation is justifiable when
the interruptions caused by high-priority traffic are frequent but of
short duration. This suggests a criterion (sufficient condition) for
satisfactory accuracy of the approximation: the high-priority busy
cycle length at a node should be short in comparison with the
low-priority service time at the same node.

Table I shows some results for model A with various parameter
combinations N, M, ν, μ, λ. We tabulate the exact throughput T and
mean delay D at each node for the low-priority customers using the
results of Section III, and the approximations to these quantities based
on the above approximation technique. We also tabulate the
high-priority mean busy cycle B at each node to enable comparison of
approximation accuracy with degree of satisfaction of the above
criterion. For T, D, B the subscripts H, L distinguish high and low priority
and l, r distinguish left and right nodes. Note that

\[ B_{H,l} = \frac{\nu - \nu^{-N}}{\nu - 1}, \qquad B_{H,r} = \frac{\nu^{-1} - \nu^{N}}{1 - \nu}. \]
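
The two busy cycle expressions are finite geometric sums in closed form, which gives a convenient way to evaluate them (including the case ν = 1, where the closed forms are indeterminate). The sketch below uses hypothetical function names and is only an illustration of that identity.

```python
def busy_cycles_closed_form(N, nu):
    """High-priority mean busy cycles (left, right) from the closed forms."""
    left = (nu - nu**(-N)) / (nu - 1.0)
    right = (nu**(-1) - nu**N) / (1.0 - nu)
    return left, right


def busy_cycles_by_sums(N, nu):
    """The same quantities written as geometric sums; valid for nu == 1 too."""
    left = sum(nu**(-n) for n in range(N + 1))
    right = sum(nu**k for k in range(N + 1)) / nu
    return left, right


print(busy_cycles_closed_form(4, 2.0))
print(busy_cycles_by_sums(4, 2.0))
```

For ν well above 1 the left busy cycle stays near ν/(ν - 1) while the right busy cycle grows like ν^(N-1), so the criterion is much harder to satisfy at the node where the high-priority customers are served slowly.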

In Table I note that when the criterion is satisfied at both nodes 
(cases 1, 2, 6, 7), the approximation is quite successful with errors of 
less than 2 percent. When the criterion is violated at one node only 
(cases 3, 8, 9), the approximate results might be regarded as satisfactory 
or unsatisfactory, depending on one's viewpoint. When the criterion is 
violated at both nodes (cases 4, 5, 10), both approximate throughputs 
and delays show large errors. Case 5 reflects a rather extreme choice 
of parameters and is included only to show the large errors which are 
theoretically possible. 

Another vehicle for examining the effectiveness of the approximation
technique is to compare it with the exact results for a homogeneous
open network as considered in Section II. The approximation yields
an expression for class i delay (including service time) at node j of

\[ D_i^j \approx \frac{\mu_j^{-1}}{1 - \sum_{k=i}^{P} \rho_k^j} \]

which, in comparison with the exact result, is seen to be too small by
a factor of

\[ 1 - \sum_{k=i+1}^{P} \rho_k^j . \]

Hence, for this type of network we would anticipate significant error 
if the approximation were applied to a priority class when higher- 
priority classes utilize a significant portion of a node's processing 
capacity. Indeed, the homogeneous network is a challenging test of the 
approximation technique since interruptions of a customer's service 
are of a duration at least comparable with the service time; our earlier 
criterion is never satisfied for such a network. 
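
To get a feel for the size of this error, the sketch below evaluates the approximate delay and the shortfall factor for a single node. It assumes that a larger class index means higher priority (as the summation limits suggest), that the approximate delay is the M/M/1 delay with the server rate reduced by the higher-priority utilization, and that classes are indexed from 0 in the code; the function names are hypothetical.

```python
def approximate_delay(rho, i, mean_service=1.0):
    """Approximate class-i delay with a dedicated, slowed-down server."""
    return mean_service / (1.0 - sum(rho[i:]))


def shortfall_factor(rho, i):
    """Factor by which the approximate class-i delay is too small."""
    return 1.0 - sum(rho[i + 1:])


rho = [0.3, 0.2, 0.1]  # utilizations by class, lowest priority first
for i in range(len(rho)):
    approx = approximate_delay(rho, i)
    implied_exact = approx / shortfall_factor(rho, i)
    print(f"class {i}: approximate {approx:.3f}, implied exact {implied_exact:.3f}")
```

With these illustrative utilizations the lowest-priority class is underestimated by 30 percent, while the highest-priority class, which sees no interruptions, incurs no error.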

VIII. CONCLUSIONS

We have seen that the analysis of queuing networks is somewhat 
involved when local balance is not satisfied but that some useful results 
can still be obtained. It is clear that further results are needed to 
extend the applicability of these models. Section VII shows that further 
attention should also be directed towards establishing and improving 
the range of validity of existing approximation techniques. 